AITopics | Minnesota

Collaborating Authors

Minnesota

What if It's Not the Phones?

The Atlantic - TechnologyJul-9-2026, 11:00:00 GMT

An evolutionary psychologist is challenging the popular understanding of kids and technology. W hen the 82-year-old psychologist Peter Gray describes the way he grew up, he punctuates the anecdotes by saying that modern parents would be arrested for letting a child have such fun. When he was 4 years old, he would walk to a store in Minneapolis to buy cigarettes for his grandmother. When he was 11, he would sometimes stay home from school in Hill City, Minnesota, to operate a newspaper printing press owned by his mother and stepfather. His parents were not arrested, and that's because the childhood they permitted him to have was basically normal at the time, even if his family did have a newspaper printing press in the house. As a boy, Peter was obsessed with fishing and baseball; neighborhood friends taught him how to ride his bike and catch grasshoppers. Although Gray's career as a scientist would begin with laboratory studies of rat hormones, he eventually found his way to writing about his childhood, in a fashion.

artificial intelligence, haidt, social media, (12 more...)

The Atlantic - Technology

Country: North America > United States > Minnesota > Hennepin County > Minneapolis (0.24)

Genre: Personal (0.68)

Industry:

Media (1.00)
Education (1.00)
Health & Medicine > Therapeutic Area > Psychiatry/Psychology (0.98)

Technology:

Information Technology > Communications (0.52)
Information Technology > Artificial Intelligence (0.47)

Add feedback

Spotify Confirms Streaming Fraud After Kalshi Trader Cries Foul

WIREDJul-2-2026, 18:30:59 GMT

One of Kalshi's most prominent traders tells WIRED he's swearing off Spotify-related markets until the issue is resolved. Top Kalshi trader Caleb Davies usually speaks to the press about how prediction markets help him rake in money. The Minneapolis-based IT worker estimates he's made $1.2 million overall across different prediction platforms, with $414,000 in winnings from Kalshi's culture markets alone. He especially enjoys wagering on music charts, because he carefully analyzes Spotify data to pick winners. "Every single morning, I'm going in, downloading the data, and updating my projections," he tells WIRED.

artificial intelligence, polymarket, wired, (12 more...)

WIRED

Country:

Europe (0.96)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.25)

Industry:

Leisure & Entertainment (1.00)
Banking & Finance > Trading > Prediction Market (1.00)

Technology:

Information Technology > Artificial Intelligence (1.00)
Information Technology > Communications > Mobile (0.30)

Add feedback

On the Entropy Calibration of Language Models

Neural Information Processing SystemsJun-23-2026, 08:56:40 GMT

We study the problem of entropy calibration, which asks whether a language model's entropy over generations matches its log loss on human text. Past work found that models are miscalibrated, with entropy per step increasing as generations grow longer, due to error accumulation. To calibrate the model and improve text quality, it has become standard practice to truncate the distribution, but this approach reduces output diversity, which we would like to avoid. Therefore, in this paper, we ask: does miscalibration improve automatically with scale, and if not, is it theoretically possible to calibrate without tradeoffs? To build intuition, we first study a simplified theoretical setting to characterize the scaling behavior of miscalibration with respect to dataset size. We find that the rate of scaling depends on the power law exponent of the data distribution -- in particular, for a power law exponent close to 1, the scaling exponent is close to 0, meaning that miscalibration improves very slowly with scale.

large language model, machine learning, natural language, (21 more...)

Neural Information Processing Systems

Country:

Asia (0.92)
North America > United States > Pennsylvania (0.28)
North America > United States > Minnesota (0.28)

Genre: Research Report > Experimental Study (1.00)

Industry:

Media (0.67)
Leisure & Entertainment (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.94)
(2 more...)

Add feedback

Set-LLM: APermutation-Invariant LLM

Neural Information Processing SystemsJun-23-2026, 07:46:33 GMT

While large language models (LLMs) demonstrate impressive capabilities across numerous applications, their robustness remains a critical concern. This paper is motivated by a specific vulnerability: the order sensitivity of LLMs. This vulnerability manifests itself as the order bias observed when LLMs decide between possible options (for example, a preference for the first option) and the tendency of LLMs to provide different answers when options are reordered. The use cases for this scenario extend beyond the classical case of multiple-choice question answering to the use of LLMs for multidocument tasks and as automated evaluators in AI pipelines. We introduce Set-LLM, a novel architectural adaptation for pretrained LLMs that enables the processing of mixed set-text inputs with permutation invariance guarantees. The adaptations involve a new attention mask and new positional encodings specifically designed for sets. We provide a theoretical proof of invariance and demonstrate through experiments that Set-LLM can be trained effectively, achieving comparable or improved performance and maintaining the runtime of the original model, while altogether eliminating order sensitivity.

large language model, machine learning, natural language, (17 more...)

Neural Information Processing Systems

Country:

Europe (1.00)
Asia (0.92)
North America > United States > Minnesota (0.28)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.67)

Industry:

Government (0.46)
Information Technology (0.46)
Education (0.35)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Refusal Direction is Universal Across Safety-Aligned Languages

Neural Information Processing SystemsJun-23-2026, 06:21:03 GMT

Refusal mechanisms in large language models (LLMs) are essential for ensuring safety. Recent research has revealed that refusal behavior can be mediated by a single direction in activation space, enabling targeted interventions to bypass refusals. While this is primarily demonstrated in an English-centric context, appropriate refusal behavior is important for any language, but poorly understood. In this paper, we investigate the refusal behavior in LLMs across 14 languages using PolyRefuse, a multilingual safety dataset created by translating malicious and benign English prompts into these languages. We uncover the surprising cross-lingual universality of the refusal direction: a vector extracted from English can bypass refusals in other languages with near-perfect effectiveness, without any additional fine-tuning. Even more remarkably, refusal directions derived from any safety-aligned language transfer seamlessly to others. We attribute this transferability to the parallelism of refusal vectors across languages in the embedding space and identify the underlying mechanism behind cross-lingual jailbreaks. These findings provide actionable insights for building more robust multilingual safety defenses and pave the way for a deeper mechanistic understanding of cross-lingual vulnerabilities in LLMs.1

large language model, machine learning, natural language, (17 more...)

Neural Information Processing Systems

Country:

Asia (1.00)
Europe (0.93)
North America > United States > Minnesota (0.28)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.93)

Industry:

Health & Medicine > Consumer Health (1.00)
Information Technology (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

Add feedback

3BASiL: An Algorithmic Framework for Sparseplus Low-Rank Compression of LLMs

Neural Information Processing SystemsJun-23-2026, 04:21:25 GMT

Sparse plus Low-Rank (S + LR) decomposition of Large Language Models (LLMs) has emerged as a promising direction in model compression, aiming to decompose pre-trained model weights into a sum of sparse and low-rank matrices W S + LR. Despite recent progress, existing methods often suffer from substantial performance degradation compared to dense models. In this work, we introduce 3BASiL-TM, an efficient one-shot post-training method for (S + LR) decomposition of LLMs that addresses this gap. Our approach first introduces a novel 3-Block Alternating Direction Method of Multipliers (ADMM) method, termed 3BASiL, to minimize the layer-wise reconstruction error with convergence guarantees.

large language model, machine learning, natural language, (19 more...)

Neural Information Processing Systems

Country: North America > United States > Minnesota (0.27)

Genre: Research Report > Experimental Study (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.95)

Add feedback

TabSTAR: ATabular Foundation Model for Tabular Data with Text Fields

Neural Information Processing SystemsJun-23-2026, 03:57:27 GMT

While deep learning has achieved remarkable success across many domains, it has historically underperformed on tabular learning tasks, which remain dominated by gradient boosting decision trees. However, recent advancements are paving the way for Tabular Foundation Models, which can leverage real-world knowledge and generalize across diverse datasets, particularly when the data contains free-text. Although incorporating language model capabilities into tabular tasks has been explored, most existing methods utilize static, target-agnostic textual representations, limiting their effectiveness. We introduce TabSTAR: a Tabular Foundation Model with Semantically Target-Aware Representations. TabSTAR is designed to enable transfer learning on tabular data with textual features, with an architecture free of dataset-specific parameters. It unfreezes a pretrained text encoder and takes as input target tokens, which provide the model with the context needed to learn task-specific embeddings. TabSTAR achieves state-of-the-art performance for both medium-and large-sized datasets across known benchmarks of classification tasks with text features, and its pretraining phase exhibits scaling laws in the number of datasets, offering a pathway for further performance improvements.1

large language model, machine learning, natural language, (19 more...)

Neural Information Processing Systems

Country:

Europe (1.00)
Asia (1.00)
North America > United States > Minnesota (0.27)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Overview (0.92)

Industry:

Consumer Products & Services (1.00)
Banking & Finance (1.00)
Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (0.92)
(6 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.93)
(2 more...)

Add feedback

Pre-Trained Policy Discriminators are General Reward Models

Neural Information Processing SystemsJun-23-2026, 03:36:28 GMT

We offer a novel perspective on reward modeling by formulating it as a policy discriminator, which quantifies the difference between two policies to generate a reward signal, guiding the training policy towards a target policy with desired behaviors. Based on this conceptual insight, we propose a scalable pre-training method named POLicy DiscriminAtive LeaRning (POLAR), which trains a reward model (RM) to discern identical policies and discriminate different ones. Unlike traditional reward modeling methods relying on absolute preferences, POLAR captures the relative difference between one policy and an arbitrary target policy, which is a scalable, high-level optimization objective suitable for modeling generic ranking relationships. Leveraging the POLAR pre-training paradigm, we present a series of RMs with parameter scales from 1.8B to 7B. Empirical results show that POLAR substantially outperforms traditional non-pre-trained methods, significantly enhancing RM performance. For instance, POLAR-7B could improve preference accuracy from 54.8% to 81.0% on STEM tasks and from 57.9% to 85.5% on creative writing tasks compared to SOTA baselines. POLAR also shows robust generalization capabilities in RLHF using Reinforcement Finetuning (RFT), providing reliable reward signals and markedly enhancing policy performance--improving LLaMa3.1-8B

arxiv preprint arxiv, large language model, machine learning, (20 more...)

Neural Information Processing Systems

Country:

Asia (0.67)
North America > United States > Minnesota (0.27)

Genre:

Research Report > Experimental Study (1.00)
Overview (0.92)
Research Report > New Finding (0.87)

Industry:

Education (0.46)
Media (0.45)
Leisure & Entertainment (0.45)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

In-Context Learning of Linear Dynamical Systems with Transformers: Approximation Bounds and Depth-separation

Neural Information Processing SystemsJun-23-2026, 03:23:41 GMT

This paper investigates approximation-theoretic aspects of the in-context learning capability of the transformers in representing a family of noisy linear dynamical systems. Our first theoretical result establishes an upper bound on the approximation error of multi-layer transformers with respect to an L2-testing loss uniformly defined across tasks. This result demonstrates that transformers with logarithmic depth can achieve error bounds comparable with those of the least-squares estimator. In contrast, our second result establishes a non-diminishing lower bound on the approximation error for a class of single-layer linear transformers, which suggests a depth-separation phenomenon for transformers in the in-context learning of dynamical systems.

large language model, machine learning, natural language, (19 more...)

Neural Information Processing Systems

Country: North America > United States > Minnesota (0.28)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.66)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.93)
(4 more...)

Add feedback

DynamicRAG: Leveraging Outputs of Large Language Model as Feedback for Dynamic Reranking in Retrieval-Augmented Generation

Neural Information Processing SystemsJun-23-2026, 03:13:14 GMT

Retrieval-augmented generation (RAG) systems combine large language models (LLMs) with external knowledge retrieval, making them highly effective for knowledge-intensive tasks. A crucial but often under-explored component of these systems is the reranker. Since irrelevant documents in RAG systems can mislead the generator, the reranker plays a vital role in refining retrieved documents to enhance generation quality and explainability. However, it is challenging to determine the appropriate number of documents (k) that the reranker should select: too few may result in missing critical information, while too many introduce noise and inefficiencies. Although recent studies have explored LLM-based rerankers, they primarily leverage internal model knowledge and overlook the rich supervisory signals that LLMs can provide, such as using response quality as feedback for optimizing reranking decisions. In this paper, we propose DynamicRAG, a novel RAG framework where the reranker dynamically adjusts both the order and number of retrieved documents based on the query. We model the reranker as an agent optimized through reinforcement learning (RL), using rewards derived from LLM output quality. Across seven knowledge-intensive datasets, DynamicRAG demonstrates superior performance, achieving state-of-the-art results among models of same parameter sizes.

computational linguistic, large language model, machine learning, (19 more...)

Neural Information Processing Systems

Country: